Deep Neural Network Acceleration Framework Under Hardware Uncertainty
Authors
Abstract
Deep Neural Networks (DNNs) are known to be effective models for performing cognitive tasks. However, DNNs are computationally expensive in both training and inference modes, as they require the precision of floating point operations. Although several prior works have proposed approximate hardware to accelerate DNN inference, they have not considered the impact of training on accuracy. In this paper, we propose a general framework called FramNN, which adjusts the DNN training model to make it appropriate for the underlying hardware. To accelerate training, FramNN applies adaptive approximation, which dynamically changes the level of hardware approximation depending on the DNN error rate. We test the efficiency of the proposed design on six popular DNN applications. Our evaluation shows that in inference, our design can achieve 1.9× energy efficiency improvement and 1.7× speedup while ensuring less than 1% quality loss. Similarly, in training mode, FramNN can achieve 5.0× energy-delay product improvement compared to a baseline AMD GPU.
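The abstract does not give pseudocode for the adaptive approximation mechanism; the sketch below shows one plausible shape for an error-driven controller that raises or lowers the hardware approximation level during training. All names and thresholds (adaptive_approx_training, train_one_epoch, tighten_at, relax_at) are illustrative assumptions, not FramNN's actual implementation.

# Minimal sketch of error-driven adaptive approximation (hypothetical names).
# train_one_epoch is assumed to run one epoch on hardware configured with the
# given approximation level and return the observed error rate.
def adaptive_approx_training(model, data, train_one_epoch, epochs=10,
                             max_level=8, tighten_at=0.05, relax_at=0.01):
    """Adjust the hardware approximation level based on the observed error rate."""
    level = max_level                      # start with the most aggressive approximation
    for _ in range(epochs):
        error_rate = train_one_epoch(model, data, approx_level=level)
        if error_rate > tighten_at and level > 0:
            level -= 1                     # error too high: move toward exact hardware
        elif error_rate < relax_at and level < max_level:
            level += 1                     # error well under budget: approximate more
    return model, level

Because the controller reads only the observed error rate, a scheme of this shape can wrap an existing training loop without changing the model or the hardware interface.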
منابع مشابه
Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks
Convolutional neural networks (CNNs) have achieved major breakthroughs in recent years. Their performance in computer vision has matched, and in some areas even surpassed, human capabilities. Deep neural networks can capture complex non-linear features; however, this ability comes at the cost of high computational and memo...
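Hardware-oriented approximation of this kind generally relies on quantizing weights and activations to fixed-point formats. The snippet below is a generic illustration of signed fixed-point quantization with a simple round-and-clip scheme; it is not Ristretto's actual algorithm, and the function and parameter names are hypothetical.

import numpy as np

def quantize_fixed_point(weights, total_bits=8, frac_bits=4):
    """Round weights to a signed fixed-point grid and clip to its representable range."""
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1)) / scale        # most negative representable value
    qmax = (2 ** (total_bits - 1) - 1) / scale     # most positive representable value
    return np.clip(np.round(weights * scale) / scale, qmin, qmax)

w = np.array([0.73, -1.2, 0.031, 9.0])
print(quantize_fixed_point(w))   # e.g. [ 0.75  -1.1875  0.  7.9375]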
Towards a Low Power Hardware Accelerator for Deep Neural Networks
In this project, we take a first step towards building a low power hardware accelerator for deep learning. We focus on RBM-based pretraining of deep neural networks and show that there is significant robustness to random errors in the pre-training, training and testing phases of using such neural networks. We propose to leverage such robustness to build accelerators using low power but possibly un...
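A robustness claim of this kind can be probed with a fault-injection experiment of the following shape; the error model (randomly dropped weights) and all names are assumptions for illustration, not the fault model used in the cited project.

import numpy as np

def drop_random_weights(weights, error_rate, rng):
    """Zero out a random fraction of weights to emulate unreliable low power hardware."""
    keep_mask = rng.random(weights.shape) >= error_rate
    return weights * keep_mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 128))          # stand-in for a trained weight matrix
x = rng.normal(size=(1, 256))            # stand-in for an input activation vector
clean = x @ w
noisy = x @ drop_random_weights(w, error_rate=0.01, rng=rng)
print(np.abs(clean - noisy).mean())      # deviation stays small at a 1% error rate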
A Shallow Network with Combined Pooling for Fast Traffic Sign Recognition
Traffic sign recognition plays an important role in intelligent transportation systems. Motivated by the recent success of deep learning in the application of traffic sign recognition, we present a shallow network architecture based on convolutional neural networks (CNNs). The network consists of only three convolutional layers for feature extraction, and it learns in a backward optimization wa...
Analyzing and Mitigating the Impact of Permanent Faults on a Systolic Array Based Neural Network Accelerator
Due to their growing popularity and computational cost, deep neural networks (DNNs) are being targeted for hardware acceleration. A popular architecture for DNN acceleration, adopted by the Google Tensor Processing Unit (TPU), utilizes a systolic array based matrix multiplication unit at its core. This paper deals with the design of fault-tolerant, systolic array based DNN accelerators for high ...
Transfer Learning with Binary Neural Networks
Previous work has shown that it is possible to train deep neural networks with low precision weights and activations. In the extreme case it is even possible to constrain the network to binary values. The costly floating point multiplications are then reduced to fast logical operations. High end smart phones such as Google’s Pixel 2 and Apple’s iPhone X are already equipped with specialised har...
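The reduction of floating point multiplications to logical operations can be seen in a dot product over ±1 vectors: sign agreement is an XNOR and the accumulation is a popcount. The sketch below illustrates the idea with hypothetical helper names; it is not the scheme of the cited paper or of any specific phone hardware.

def binarize(values):
    """Pack a ±1 vector into an integer bitmask (+1 -> bit set, -1 -> bit clear)."""
    mask = 0
    for i, v in enumerate(values):
        if v > 0:
            mask |= 1 << i
    return mask

def binary_dot(x_bits, w_bits, n):
    """Dot product of two ±1 vectors of length n using XNOR and popcount only."""
    agree = ~(x_bits ^ w_bits) & ((1 << n) - 1)   # XNOR: bits where the signs match
    matches = bin(agree).count("1")               # popcount
    return 2 * matches - n                        # matches minus mismatches

x = [1, -1, 1, 1]
w = [1, 1, -1, 1]
print(binary_dot(binarize(x), binarize(w), len(x)))   # -> 0, same as the float dot product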